Smart Paper Notes

Feature-Level Fusion of Super-App and Telecommunication Alternative Data Sources for Credit Card Fraud Detection

Jaime D. Acevedo-Viloria , Sebastián Soriano Pérez , Jesus Solano , David Zarruk-Valencia , Fernando G. Paulin , Alejandro Correa-Bahnsen

Category: Machine Learning

2021-11-05

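Identity theft is a major problem for credit lenders when there is not enough data to corroborate a customer's identity. Among super-apps, large digital platforms that encompass many different services, the problem is even more relevant: losing a customer in one branch often means losing them in other services as well. In this paper, we review the effectiveness of a feature-level fusion of super-app information, mobile phone line data, and traditional credit risk variables for the early detection of identity-theft credit card fraud. With the proposed framework, we achieved better performance when using a model whose input is a fusion of alternative data and traditional credit bureau data, reaching a ROC AUC score of 0.81. We evaluate our approach on approximately 90,000 users from a credit lender's digital platform database. The evaluation was performed using not only traditional ML metrics but also financial costs.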
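As a rough illustration of what feature-level fusion means here, the sketch below joins per-user feature tables from several sources into one wide row per user before any model is trained. It is a minimal sketch and not the paper's pipeline: the source names, feature names, values, and the inner-join policy are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical per-user feature tables, one per data source.
# All names and values are illustrative and do not come from the paper.
FeatureTable = Dict[str, Dict[str, float]]


def fuse_features(sources: Dict[str, FeatureTable]) -> FeatureTable:
    """Concatenate the features of every source into one row per user.

    Only users present in all sources are kept (an inner join, which is an
    assumption of this sketch). Column names are prefixed with the source
    name so that features from different sources cannot collide.
    """
    tables = list(sources.values())
    if not tables:
        return {}
    common_users = set(tables[0])
    for table in tables[1:]:
        common_users &= set(table)

    fused: FeatureTable = {}
    for user in sorted(common_users):
        row: Dict[str, float] = {}
        for source_name, table in sources.items():
            for feature, value in table[user].items():
                row[f"{source_name}.{feature}"] = value
        fused[user] = row
    return fused


super_app = {"u1": {"orders_30d": 12.0}, "u2": {"orders_30d": 0.0}}
telco = {"u1": {"line_age_months": 48.0}, "u2": {"line_age_months": 1.0}}
bureau = {"u1": {"credit_score": 710.0}}

fused = fuse_features({"app": super_app, "telco": telco, "bureau": bureau})
print(fused)
```

The fused rows would then be fed to a single classifier, in contrast to decision-level fusion, where one model per source is trained and their scores are combined afterwards.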

translated by Google Translate

Ithaca. A Tool for Integrating Fuzzy Logic in Unity

Alfonso Tejedor Moreno , Jose A. Piedra-Fernandez , Juan Jesus Ojeda-Castelo , Luis Iribarne

Category: Artificial Intelligence

2023-01-01

Ithaca is a Fuzzy Logic (FL) plugin for developing artificial intelligence systems within the Unity game engine. Its goal is to provide an intuitive and natural way to build advanced artificial intelligence systems, making the implementation of such systems faster and more affordable. The software is made up of a C# framework and an Application Programming Interface (API) for writing inference systems, as well as a set of tools for graphic development and debugging. Additionally, a Fuzzy Control Language (FCL) parser is provided in order to import systems previously defined using this standard.
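For readers unfamiliar with fuzzy inference systems, the following sketch shows the general idea in a few lines. It does not use or imitate Ithaca's API, and it is written in Python rather than C#. The variable names, the membership breakpoints, and the choice of a zero-order Sugeno-style weighted average are assumptions made purely for illustration.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def infer_aggression(distance: float) -> float:
    """Map an enemy distance to an aggression level in [0, 1].

    Hypothetical rule base:
      IF distance IS near   THEN aggression = 1.0
      IF distance IS medium THEN aggression = 0.5
      IF distance IS far    THEN aggression = 0.0
    The crisp output is the firing-strength-weighted average of the rule
    outputs (zero-order Sugeno-style defuzzification).
    """
    near = triangular(distance, -1.0, 0.0, 10.0)
    medium = triangular(distance, 5.0, 10.0, 15.0)
    far = triangular(distance, 10.0, 20.0, 21.0)
    rules = [(near, 1.0), (medium, 0.5), (far, 0.0)]
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.0  # no rule fires outside the modelled range
    return sum(strength * output for strength, output in rules) / total


for d in (0.0, 7.5, 10.0, 20.0):
    print(d, round(infer_aggression(d), 3))
```

Between the breakpoints the output changes smoothly, which is the main practical appeal of fuzzy rules over hard thresholds in game AI.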


A Mutation-based Text Generation for Adversarial Machine Learning Applications

Jesus Guerrero , Gongbo Liang , Izzat Alsmadi

Category: Natural Language Processing | Machine Learning

2022-12-21

Many natural-language applications involve text generation, whether the text is created by humans or by machines. In many of those applications machines support humans, but in a few others (e.g., adversarial machine learning, social bots, and trolls) machines try to impersonate humans. In this scope, we proposed and evaluated several mutation-based text generation approaches. Unlike machine-generated text, mutation-based generated text needs human text samples as inputs. We showed examples of mutation operators, but this work can be extended in many directions, such as proposing new text-based mutation operators based on the nature of the application.
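To make the idea of a text mutation operator concrete, here is a small sketch with three word-level operators applied to a human-written seed sentence. The specific operators (swap, delete, duplicate) and every function name are assumptions chosen for illustration; the abstract does not say which operators the authors used.

```python
import random
from typing import Callable, List

# A mutation operator takes the words of a human-written sample and a
# random generator, and returns a slightly altered copy of the words.
Operator = Callable[[List[str], random.Random], List[str]]


def swap_adjacent_words(words: List[str], rng: random.Random) -> List[str]:
    """Swap one randomly chosen pair of neighbouring words."""
    if len(words) < 2:
        return list(words)
    i = rng.randrange(len(words) - 1)
    mutated = list(words)
    mutated[i], mutated[i + 1] = mutated[i + 1], mutated[i]
    return mutated


def delete_word(words: List[str], rng: random.Random) -> List[str]:
    """Remove one randomly chosen word (never empties the sentence)."""
    if len(words) < 2:
        return list(words)
    i = rng.randrange(len(words))
    return words[:i] + words[i + 1:]


def duplicate_word(words: List[str], rng: random.Random) -> List[str]:
    """Repeat one randomly chosen word in place."""
    if not words:
        return []
    i = rng.randrange(len(words))
    return words[:i + 1] + words[i:]


OPERATORS: List[Operator] = [swap_adjacent_words, delete_word, duplicate_word]


def mutate(text: str, n_mutations: int = 1, seed: int = 0) -> str:
    """Apply n randomly chosen operators to a human text sample."""
    rng = random.Random(seed)
    words = text.split()
    for _ in range(n_mutations):
        operator = rng.choice(OPERATORS)
        words = operator(words, rng)
    return " ".join(words)


print(mutate("the quick brown fox jumps over the lazy dog", n_mutations=2, seed=42))
```

Seeding the generator keeps the mutations reproducible, which matters when the mutated samples are later used to evaluate a detector.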


Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation

Sérgio Jesus , José Pombal , Duarte Alves , André Cruz , Pedro Saleiro , Rita P. Ribeiro , João Gama , Pedro Bizarro

Category: Machine Learning

2022-11-24

Evaluating new techniques on realistic datasets plays a crucial role in the development of ML research and its broader adoption by practitioners. In recent years, there has been a significant increase of publicly available unstructured data resources for computer vision and NLP tasks. However, tabular data -- which is prevalent in many high-stakes domains -- has been lagging behind. To bridge this gap, we present Bank Account Fraud (BAF), the first publicly available privacy-preserving, large-scale, realistic suite of tabular datasets. The suite was generated by applying state-of-the-art tabular data generation techniques on an anonymized, real-world bank account opening fraud detection dataset. This setting carries a set of challenges that are commonplace in real-world applications, including temporal dynamics and significant class imbalance. Additionally, to allow practitioners to stress test both performance and fairness of ML methods, each dataset variant of BAF contains specific types of data bias. With this resource, we aim to provide the research community with a more realistic, complete, and robust test bed to evaluate novel and existing methods.


Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings

Ana María Quintero-Ossa , Jesús Solano , Hernán Jarcía , David Zarruk , Alejandro Correa Bahnsen , Carlos Valencia

Category: Machine Learning

2022-11-10

Privacy-preserving machine learning in data-sharing processes is an ever-critical task that enables collaborative training of Machine Learning (ML) models without the need to share the original data sources. It is especially relevant when an organization must assure that sensitive data remains private throughout the whole ML pipeline, i.e., training and inference phases. This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data. Thus, organizations can share the data representation to increase machine learning models' performance in scenarios with more than one data source for a shared predictive downstream task.


Proactive Detractor Detection Framework Based on Message-Wise Sentiment Analysis Over Customer Support Interactions

Juan Sebastián Salcedo Gallo , Jesús Solano , Javier Hernán García , David Zarruk-Valencia , Alejandro Correa-Bahnsen

Category: Natural Language Processing | Machine Learning

2022-11-08

In this work, we propose a framework relying solely on chat-based customer support (CS) interactions for predicting the recommendation decision of individual users. For our case study, we analyzed a total number of 16.4k users and 48.7k customer support conversations within the financial vertical of a large e-commerce company in Latin America. Consequently, our main contributions and objectives are to use Natural Language Processing (NLP) to assess and predict the recommendation behavior where, in addition to using static sentiment analysis, we exploit the predictive power of each user's sentiment dynamics. Our results show that, with respective feature interpretability, it is possible to predict the likelihood of a user to recommend a product or service, based solely on the message-wise sentiment evolution of their CS conversations in a fully automated way.
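One way to picture "message-wise sentiment evolution" is as a short time series of per-message sentiment scores that is summarised into a handful of features. The sketch below assumes the scores have already been produced by some sentiment model and lie in [-1, 1]. The feature set (mean, first, last, delta, minimum, least-squares slope) is a hypothetical choice and is not taken from the paper.

```python
from statistics import mean
from typing import Dict, List


def sentiment_dynamics(scores: List[float]) -> Dict[str, float]:
    """Summarise how a user's sentiment evolves across one conversation.

    `scores` holds one sentiment value per user message, in chronological
    order. The slope is the ordinary least-squares slope of score against
    message index, so a negative value means the mood is deteriorating.
    """
    if not scores:
        raise ValueError("at least one message score is required")
    n = len(scores)
    x_bar = (n - 1) / 2  # mean of the indices 0..n-1
    y_bar = mean(scores)
    denom = sum((x - x_bar) ** 2 for x in range(n))
    if denom:
        slope = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(scores)) / denom
    else:
        slope = 0.0  # a single message has no trend
    return {
        "mean": y_bar,
        "first": scores[0],
        "last": scores[-1],
        "delta": scores[-1] - scores[0],
        "min": min(scores),
        "slope": slope,
    }


features = sentiment_dynamics([0.5, 0.0, -0.5])
print(features)
```

Features like these could sit alongside a static, whole-conversation sentiment score as inputs to a classifier that predicts whether the user is likely to recommend the service.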


Moving Frame Net: SE(3)-Equivariant Network for Volumes

Mateus Sangalli , Samy Blusseau , Santiago Velasco-Forero , Jesus Angulo

Category: Computer Vision | Machine Learning (Statistics)

2022-11-07

Equivariance of neural networks to transformations helps to improve their performance and reduce generalization error in computer vision tasks, as they apply to datasets presenting symmetries (e.g. scalings, rotations, translations). The method of moving frames is classical for deriving operators invariant to the action of a Lie group in a manifold. Recently, a rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. In this paper we significantly improve that approach by reducing the computation of moving frames to only one, at the input stage, instead of repeated computations at each layer. The equivariance of the resulting architecture is proved theoretically and we build a rotation and translation equivariant neural network to process volumes, i.e. signals on the 3D space. Our trained model outperforms the benchmarks in medical volume classification on most of the tested datasets from MedMNIST3D.


Corneal endothelium assessment in specular microscopy images with Fuchs' dystrophy via deep regression of signed distance maps

Juan S. Sierra , Jesus Pineda , Daniela Rueda , Alejandro Tello , Angelica M. Prada , Virgilio Galvis , Giovanni Volpe , Maria S. Millan , Lenny A. Romero , Andres G. Marrugo

Category: Computer Vision | Machine Learning

2022-10-13

Specular microscopy assessment of the human corneal endothelium (CE) in Fuchs' dystrophy is challenging due to the presence of dark image regions called guttae. This paper proposes a UNet-based segmentation approach that requires minimal post-processing and achieves reliable CE morphometric assessment and guttae identification across all degrees of Fuchs' dystrophy. We cast the segmentation problem as a regression task of the cell and gutta signed distance maps instead of a pixel-level classification task as typically done with UNets. Compared to the conventional UNet classification approach, the distance-map regression approach converges faster in clinically relevant parameters. It also produces morphometric parameters that agree with the manually-segmented ground-truth data, namely the average cell density difference of -41.9 cells/mm² (95% confidence interval (CI) [-306.2, 222.5]) and the average difference of mean cell area of 14.8 µm² (95% CI [-41.9, 71.5]). These results suggest a promising alternative for CE assessment.
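The regression target mentioned above, a signed distance map, can be illustrated on a toy binary mask. The brute-force sketch below assigns each pixel its Euclidean distance to the nearest pixel of the opposite class, positive inside the object and negative outside. The sign convention, the pixel-centre distance definition, and the function name are assumptions of this example; the paper may define its maps differently.

```python
import math
from typing import List


def signed_distance_map(mask: List[List[int]]) -> List[List[float]]:
    """Brute-force signed distance map of a small binary mask.

    Pixels equal to 1 are "inside" and get a positive distance to the
    nearest background pixel; pixels equal to 0 get the negated distance
    to the nearest foreground pixel. Cost is quadratic in the number of
    pixels, so this is only meant for tiny illustrative inputs.
    """
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols)]
    result: List[List[float]] = []
    for r in range(rows):
        out_row: List[float] = []
        for c in range(cols):
            inside = mask[r][c] == 1
            distances = [
                math.hypot(r - rr, c - cc)
                for rr, cc in pixels
                if (mask[rr][cc] == 1) != inside
            ]
            if not distances:
                raise ValueError("mask must contain both classes")
            nearest = min(distances)
            out_row.append(nearest if inside else -nearest)
        result.append(out_row)
    return result


toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdm = signed_distance_map(toy_mask)
for row in sdm:
    print([round(v, 2) for v in row])
```

A network trained to regress such a map predicts a smooth field whose zero crossing marks the object boundary, rather than a hard per-pixel class label.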


Robust MADER: Decentralized and Asynchronous Multiagent Trajectory Planner Robust to Communication Delay

Kota Kondo , Jesus Tordesillas , Reinaldo Figueroa , Juan Rached , Joseph Merkel , Parker C. Lusk , Jonathan P. How

Category: Robotics

2022-09-27

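Although communication delays can disrupt multiagent systems, most existing multiagent trajectory planners lack a strategy to address this issue. State-of-the-art approaches typically assume a perfect communication environment, which is hardly realistic in real-world experiments. This paper presents Robust MADER (RMADER), a decentralized and asynchronous multiagent trajectory planner that can handle communication delays among agents. By broadcasting both the newly optimized trajectory and the committed trajectory, and by performing a delay check step, RMADER is able to guarantee safety even under communication delay. RMADER was validated through extensive simulation and hardware flight experiments and achieved a 100% success rate of collision-free trajectory generation, outperforming state-of-the-art approaches.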


Mapless Navigation of a Hybrid Aerial Underwater Vehicle with Deep Reinforcement Learning Through Environmental Generalization

Ricardo B. Grando , Junior C. de Jesus , Victor A. Kich , Alisson H. Kolling , Rodrigo S. Guerra , Paulo L. J. Drews-Jr

Category: Robotics | Artificial Intelligence

2022-09-13

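Previous work has shown that Deep-RL can be applied to mapless navigation, including the medium transition of Hybrid Unmanned Aerial Underwater Vehicles (HUAUVs). This paper presents new approaches based on state-of-the-art actor-critic algorithms to address the navigation and medium-transition problems for a HUAUV. We show that a double-critic Deep-RL with Recurrent Neural Networks improves the navigation performance of HUAUVs using only range data and relative localization. Our Deep-RL approaches achieved better navigation and transitioning capabilities, with a solid generalization of learning across distinct simulated scenarios, outperforming previous approaches.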
